Abstract

For some fixed alphabet $A$ with $|A| \ge 2$, a language $L \subseteq A^*$ is in the class $\mathcal{L}_{1/2}$ of the
Straubing-Therien hierarchy if and only if it can be expressed as a finite union of languages
$A^* a_1 A^* a_2 A^* \cdots A^* a_n A^*$, where $a_i \in A$ and $n \ge 0$. The class $\mathcal{L}_1$ is defined as the boolean
closure of $\mathcal{L}_{1/2}$. It is known that the classes $\mathcal{L}_{1/2}$ and $\mathcal{L}_1$ are decidable. We give a membership
criterion for the single classes of the boolean hierarchy over $\mathcal{L}_{1/2}$. From this criterion we can
conclude that this boolean hierarchy is proper and that its classes are decidable. In finite model
theory the latter implies the decidability of the classes of the boolean hierarchy over the class
$\Sigma_1$ of the FO[<]-logic. Moreover we prove a "forbidden-pattern" characterization of $\mathcal{L}_1$ of
the type: $L \in \mathcal{L}_1$ if and only if a certain pattern does not appear in the transition graph of a
deterministic finite automaton accepting $L$. We discuss complexity theoretical consequences
of our results.

Classification: finite automata, concatenation hierarchies, boolean hierarchy, decidability

1 Introduction

We contribute to the theory of finite automata and regular languages, as well as to complexity 
theory. Particularly we deal with starfree regular languages. These are languages which are con- 
structed from alphabet letters only by using boolean operations together with concatenation. Alter- 
nating these two kinds of operations in order to distinguish between combinatorial and sequential 
aspects leads to the definition of concatenation hierarchies that exhaust the class of starfree lan- 
guages. 



Prominent examples are the dot-depth hierarchy, first studied in [CB71], and the Straubing-Therien
hierarchy [Str81, The81, Str85]. Both are known to be strict [BK78] and closely related
to each other. Most naturally arising questions concerning these hierarchies are of major interest
in different research areas since there are close connections to finite model theory, theory of finite
semigroups, topology, boolean circuits and others. For an overview or as a good starting point to
this rich field of research see e.g. the articles [Brz76, Pin96a, Pin96b, Tho96].
In this paper we deal with the so-called Straubing-Therien hierarchy. Let $A$ be some finite
alphabet with $|A| \ge 2$. For a class $\mathcal{C}$ of languages over $A$ let POL($\mathcal{C}$) be its polynomial
closure, i.e. the class of languages $L$ that can be written as a finite union of languages
$L_0 a_1 L_1 a_2 L_2 \cdots L_{n-1} a_n L_n$, where $a_i \in A$, $L_i \in \mathcal{C}$ and $n \ge 0$. Denote by BC($\mathcal{C}$) its
boolean closure, i.e. the closure of $\mathcal{C}$ under finite union, finite intersection and complementation.
Then the Straubing-Therien hierarchy can be defined as the family of classes $\mathcal{L}_{n/2}$,
where we define $\mathcal{L}_0 =_{\mathrm{def}} \{\emptyset, A^*\}$, $\mathcal{L}_{n+1/2} =_{\mathrm{def}} \mathrm{POL}(\mathcal{L}_n)$, and $\mathcal{L}_{n+1} =_{\mathrm{def}} \mathrm{BC}(\mathcal{L}_{n+1/2})$ for
$n \ge 0$ (notations are adopted from [PW97]). We will also consider the classes co$\mathcal{L}_{n+1/2}$, where
co$\mathcal{C} =_{\mathrm{def}} \{\, \overline{L} \mid L \in \mathcal{C} \,\}$ for a class $\mathcal{C}$. It was shown by M. Arfi in [Arf87, Arf91] that the classes
$\mathcal{L}_{n+1/2}$ (and co$\mathcal{L}_{n+1/2}$) are closed under intersection. For a language $L \subseteq A^*$ and a minimal $n$
with $L \in \mathcal{L}_{n/2}$ we say that $L$ has level $n/2$.

The connection between first-order logic and the class of starfree languages goes back to the
work of McNaughton and Papert [MP71]. The Straubing-Therien hierarchy is related to the
first-order logic FO[<] having only the binary relation < and unary relations for the alphabet symbols
from $A$. Let $\Sigma_k$ be the subclass of FO[<] which is defined by at most $k - 1$ quantifier alternations,
starting with an existential quantifier. It has been proved by W. Thomas in [Tho82] (see also
[PP86]) that $\Sigma_k$ formulas describe just the $\mathcal{L}_{k-1/2}$ languages and that the boolean combinations
of $\Sigma_k$ formulas describe just the $\mathcal{L}_k$ languages.

Unfortunately one main question about the Straubing-Therien hierarchy, namely the question
of the decidability of its classes, appears to be extremely difficult, although a lot of effort via
different approaches has been invested. The decidability problem can be stated as follows: given
some $n \ge 0$ and a regular language $L$ presented by a deterministic finite automaton, decide
whether or not $L$ has level $n/2$. To our knowledge, only levels 0, 1/2, 1, and 3/2 are known
to be decidable (cf. [PW97]).

The purpose of this paper is to start with an exact analysis of what happens between level 1/2
and level 1. Since $\mathcal{L}_1 = \mathrm{BC}(\mathcal{L}_{1/2})$ and since $\mathrm{BC}(\mathcal{L}_{1/2})$ is just the union of the classes $\mathcal{L}_{1/2}(k)$ of
the boolean hierarchy over $\mathcal{L}_{1/2}$ we study these classes $\mathcal{L}_{1/2}(k)$ and their decidability.

J. Stern [Ste85] proved the following interesting characterization of the class $\mathcal{L}_1$ (the class of
piecewise testable languages over alphabet $A$): A language $L \subseteq A^*$ is in $\mathcal{L}_1$ if and only if there
does not exist an infinite chain $w_1, w_2, w_3, \ldots$ of words where $w_{i+1}$ is an extension of $w_i$ and
$w_i \in L \iff w_{i+1} \notin L$ for $i = 1, 2, 3, \ldots$. Let $m^+(L)$ be the length of a maximal chain of this
kind starting with $w_1 \in L$. Using a normal form theorem for classes of boolean hierarchies, we
prove that $L \in \mathcal{L}_{1/2}(k)$ if and only if $m^+(L) < k$. Since the latter property can be decided for
fixed $k$ with a nondeterministic logarithmic space algorithm, we can also decide the membership
problem for the classes $\mathcal{L}_{1/2}(k)$ with a nondeterministic logarithmic space algorithm. Furthermore
we show that the measure $m^+(L)$ is computable with an exponential space algorithm. Another
consequence of the above membership criterion for the classes $\mathcal{L}_{1/2}(k)$ is the fact that this boolean
hierarchy is indeed proper.

As a second contribution we prove a "forbidden-pattern" characterization of $\mathcal{L}_1$ of the type:
$L \in \mathcal{L}_1$ if and only if a certain pattern (see Figure 3) does not appear in a deterministic finite
automaton accepting $L$. Such characterizations were already known for the classes $\mathcal{L}_{1/2}$ and $\mathcal{L}_{3/2}$
[PW97]. Our characterization easily provides a nondeterministic logspace decision algorithm for
$\mathcal{L}_1$.

There is a close connection between concatenation hierarchies and complexity classes, both
related via the so-called leaf language approach to define complexity classes. This approach was
introduced in [BCS92, Ver93] and led to a number of interesting results (cf. [HLS+93, JMT94,
BV98, CHVW98]). In particular in [BV98] it was shown that taking the languages from $\mathcal{L}_{k-1/2}$ as
leaf languages yields exactly the $k$-th class of the polynomial time hierarchy. In the last section we
state a result of this type relating the boolean hierarchy over level 1/2 of the Straubing-Therien
hierarchy to the boolean hierarchy over NP. A similar, but ineffective result concerning the boolean
hierarchy over level 1/2 of the dot-depth hierarchy was obtained in [BKS98]. Here we can make
use of our decision algorithm, which is not known for the case of the dot-depth hierarchy.

Finally we want to make a remark concerning our methods. First we note that the normal form
results we use for the classes of the boolean hierarchy over $\mathcal{L}_{1/2}$ are valid also for the classes
of the boolean hierarchy over every class $\mathcal{L}_{n+1/2}$. This combined with the "forbidden-pattern"
technique could work to achieve similar structural and decidability results for every level of the
Straubing-Therien hierarchy.



2 Preliminaries 



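We consider languages over an arbitrary finite alphabet $A$ with $|A| \ge 2$. For a class $\mathcal{C}$ of languages,
let BC($\mathcal{C}$) be the boolean closure of $\mathcal{C}$, i.e. BC($\mathcal{C}$) is the smallest class containing $\mathcal{C}$ and being
closed under union, intersection and complementation. For a class $\mathcal{C}$ which is closed under union
and intersection, the boolean hierarchy over $\mathcal{C}$ is the family of classes $\mathcal{C}(k)$ and co$\mathcal{C}(k)$ with $k \ge 1$,
where $\mathcal{C}(k)$ can be defined (besides many other equivalent possibilities, cf. [KSW87, CGH+88])
as

$$\mathcal{C}(k) =_{\mathrm{def}} \underbrace{\mathcal{C} \oplus \mathcal{C} \oplus \cdots \oplus \mathcal{C}}_{k \text{ times}},$$

where $\mathcal{C} \oplus \mathcal{C} =_{\mathrm{def}} \{\, A \mathbin{\triangle} B \mid A \in \mathcal{C}, B \in \mathcal{C} \,\}$, $\triangle$ denotes the symmetric set difference and
co$\mathcal{C} =_{\mathrm{def}} \{\, \overline{L} \mid L \in \mathcal{C} \,\}$.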

The following lemma states some well-known properties of the classes of the boolean hierar- 
chy over C. Their normal form characterization in statements 3 and 4 provides one of the other 
possibilities of their definition. 



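Lemma 2.1. Let $\mathcal{C}$ be a class of languages which is closed under union and intersection, and let
$k \ge 1$.

1. $\mathrm{BC}(\mathcal{C}) = \bigcup_{k \ge 1} \mathcal{C}(k)$.

2. $\mathcal{C}(k) \cup \mathrm{co}\mathcal{C}(k) \subseteq \mathcal{C}(k+1) \cap \mathrm{co}\mathcal{C}(k+1)$.

3. $L \in \mathcal{C}(2k-1)$ if and only if there exist languages $L_1, L_2, \ldots, L_{2k-1} \in \mathcal{C}$ such that
$L_1 \supseteq L_2 \supseteq \cdots \supseteq L_{2k-1}$ and $L = \bigcup_{i=1}^{k-1} (L_{2i-1} \setminus L_{2i}) \cup L_{2k-1}$.

4. $L \in \mathcal{C}(2k)$ if and only if there exist languages $L_1, L_2, \ldots, L_{2k} \in \mathcal{C}$ such that
$L_1 \supseteq L_2 \supseteq \cdots \supseteq L_{2k}$ and $L = \bigcup_{i=1}^{k} (L_{2i-1} \setminus L_{2i})$.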
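The normal forms in statements 3 and 4 can be checked on small finite examples. The following Python sketch is illustrative only and is not taken from the paper: it assumes the members of $\mathcal{C}$ are given as a few finite sets, and the helper names `descending_chain` and `from_chain` are hypothetical. It relies on the standard observation that an element lies in $A_1 \triangle \cdots \triangle A_k$ exactly when it lies in an odd number of the $A_i$; taking $L_j$ to be the set of elements contained in at least $j$ of the $A_i$ (a union of intersections, hence again in $\mathcal{C}$) yields a descending chain.

```python
from functools import reduce
from itertools import combinations


def descending_chain(sets):
    """Return [L_1, ..., L_k], where L_j holds the elements lying in at least j of the sets."""
    chain = []
    for j in range(1, len(sets) + 1):
        level = set()
        for combo in combinations(sets, j):
            # Each L_j is a union of j-fold intersections of the given sets.
            level |= set.intersection(*combo)
        chain.append(level)
    return chain


def from_chain(chain):
    """Evaluate (L_1 \\ L_2) u (L_3 \\ L_4) u ...; a trailing unpaired L_k is added whole."""
    result = set()
    for i in range(0, len(chain), 2):
        lower = chain[i + 1] if i + 1 < len(chain) else set()
        result |= chain[i] - lower
    return result


def symmetric_difference_all(sets):
    """A_1 (sym. diff.) A_2 (sym. diff.) ... (sym. diff.) A_k."""
    return reduce(lambda x, y: x ^ y, sets)
```

For an odd number of sets the trailing term corresponds to $L_{2k-1}$ in statement 3; for an even number every level is paired as in statement 4.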
For a class $\mathcal{C}$ of languages, let POL($\mathcal{C}$) be its polynomial closure, i.e. the class of languages
$L$ that can be written as a finite union of languages $L_0 a_1 L_1 a_2 L_2 \cdots L_{n-1} a_n L_n$, where $a_i \in A$,
$L_i \in \mathcal{C}$ and $n \ge 0$. Then the Straubing-Therien hierarchy can be defined as the following family
of classes, where notations are adopted from [PW97].

1. $\mathcal{L}_0 =_{\mathrm{def}} \{\emptyset, A^*\}$
2. $\mathcal{L}_{n+1/2} =_{\mathrm{def}} \mathrm{POL}(\mathcal{L}_n)$ for $n \ge 0$
3. $\mathcal{L}_{n+1} =_{\mathrm{def}} \mathrm{BC}(\mathcal{L}_{n+1/2})$ for $n \ge 0$

We will also take into consideration the classes co$\mathcal{L}_{n+1/2}$. Any class $\mathcal{L}_{n+1/2}$ can be equivalently
defined as the closure of the class $\mathcal{L}_n$ under union, intersection and the so-called marked
concatenation (cf. [Arf87, Arf91]). Consequently, the results of Lemma 2.1 apply also to the classes
$\mathcal{C} = \mathcal{L}_{n+1/2}$. For a language $L \subseteq A^*$ and a minimal $n$ with $L \in \mathcal{L}_{n/2}$ we say that $L$ has level $n/2$.

Next we point out a very natural connection between the Straubing-Therien hierarchy and a
certain logic over finite words. We define formulas using the binary relation symbol < and unary
relation symbols $\pi_a$ for each letter $a \in A$. Atomic formulas are of the type $x < y$, $x = y$ and $\pi_a x$,
with variables $x, y$. Then formulas are constructed from atomic formulas by using the connectives
$\neg, \vee, \wedge$ and quantifiers $\exists, \forall$ binding variables. Let $\Sigma_k$ ($\Pi_k$) be the subclass of such formulas which
have at most $k - 1$ quantifier alternations, starting with an existential (universal, resp.) quantifier.
We say a language $L \subseteq A^*$ is FO[<]-definable if there exists a sentence $\varphi$ (i.e. a formula of
the above type without free variables) such that exactly the words $w \in L$ satisfy $\varphi$ when variables are
interpreted as positions in $w$, $\pi_a x$ means the letter at position $x$ is $a$, and < is the usual <-relation
on $\{1, \ldots, |w|\}$.



Theorem 2.2 [Tho82, PP86]. Let $k \ge 1$, and let $L \subseteq A^*$ be any language.

1. $L \in \mathcal{L}_{k-1/2}$ if and only if $L$ is FO[<]-definable by a $\Sigma_k$ formula.

2. $L \in \mathrm{co}\mathcal{L}_{k-1/2}$ if and only if $L$ is FO[<]-definable by a $\Pi_k$ formula.

3. $L \in \mathcal{L}_k$ if and only if $L$ is FO[<]-definable by a boolean combination of $\Sigma_k$ formulas.

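Let $\varepsilon$ be the empty word. We denote by $\preceq$ the subword relation on $A^*$, i.e. $w \preceq v$ if and
only if there exist $n \ge 0$, $a_1, a_2, \ldots, a_n \in A$ and $v_0, v_1, \ldots, v_n \in A^*$ such that $w = a_1 a_2 \cdots a_n$
and $v = v_0 a_1 v_1 a_2 v_2 \cdots a_n v_n$. For $w \in A^*$ we define $\langle w \rangle^{\preceq} =_{\mathrm{def}} \{\, v \mid w \preceq v \,\}$ as the set of all
words having $w$ as a subword, i.e. $\langle a_1 a_2 \cdots a_n \rangle^{\preceq} = A^* a_1 A^* a_2 A^* \cdots A^* a_n A^*$ for all $n \ge 0$
and $a_1, a_2, \ldots, a_n \in A$. Moreover, for a language $L$ let $\langle L \rangle^{\preceq} =_{\mathrm{def}} \bigcup_{w \in L} \langle w \rangle^{\preceq}$ be the set of
all words having a subword in $L$. For a word $w = a_1 a_2 \cdots a_n$ we denote with $w^R$ its reverse,
i.e. $w^R =_{\mathrm{def}} a_n a_{n-1} \cdots a_1$, and for a language $L$ let $L^R =_{\mathrm{def}} \{\, w^R \mid w \in L \,\}$. We will denote
infinite sequences of words $\{w_i\}_{i=1}^{\infty}$ for short as $\{w_i\}$.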
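As a small illustration of the subword relation, the following Python sketch tests $w \preceq v$ and membership in $\langle W \rangle^{\preceq}$ for a finite set $W$ of words. Words are represented as Python strings; the function names `is_subword` and `in_upward_closure` are hypothetical and chosen only for this example.

```python
def is_subword(w: str, v: str) -> bool:
    """Return True iff w is a (scattered) subword of v, i.e. w can be
    obtained from v by deleting letters."""
    remaining = iter(v)
    # Each letter of w must be found in v strictly after the previous match;
    # `ch in remaining` advances the iterator past the matched position.
    return all(ch in remaining for ch in w)


def in_upward_closure(words, v: str) -> bool:
    """Return True iff v has some word of the finite collection `words` as a subword."""
    return any(is_subword(w, v) for w in words)
```

For instance, `is_subword("ab", "xaxbx")` holds while `is_subword("ba", "ab")` does not, and the empty word is a subword of every word.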

As is standard, a deterministic finite automaton (dfa) $F$ is given by $F = (A, S, \delta, s_0, S')$,
where $A$ is its input alphabet, $S$ is its set of states, $\delta : S \times A \to S$ is its transition function, $s_0 \in S$
is the starting state and $S' \subseteq S$ is the set of accepting states. We consider nondeterministic finite
automata (nfa) as well, where $\delta : S \times A \to 2^S$. With $L(F)$ we denote the language accepted by
an automaton $F$. As usual we extend transition functions to input words, and we denote by $|F|$
the number of states of $F$.
Theorem 2.3. For every $L \subseteq A^*$ the following are equivalent:

(1) $L \in \mathcal{L}_{1/2}$

(2) $L$ is a finite union of sets $\langle w \rangle^{\preceq}$

(3) $L$ is regular and $\langle L \rangle^{\preceq} = L$

Proof. The equivalence (1) $\Leftrightarrow$ (2) is by definition, and (2) $\Rightarrow$ (3) is obvious. For (3) $\Rightarrow$ (2), let $F$
be a dfa such that $\langle L(F) \rangle^{\preceq} = L(F)$. Let $F'$ be the nfa which is constructed from $F$ by introducing
for every state and every $a \in A$ a simple loop with $a$. Obviously, $L(F') = L(F)$. Now convert
$F'$ into the nfa $F''$ by removing all nontrivial loops, i.e. by keeping only the paths leading directly
from the starting state to an accepting state. Also, $L(F'') = L(F')$. Now, $L(F'')$ is the union of
all $\langle a_1 a_2 \cdots a_n \rangle^{\preceq}$ where $a_1 a_2 \cdots a_n$ is a path in $F''$ leading directly from the starting state to an
accepting state. □

We assume the reader to be familiar with complexity classes of common interest such as NL, 
P, NP and the levels of the polynomial time hierarchy. 



3 Alternating Word Extension Chains 

We will obtain a membership criterion for the classes $\mathcal{L}_{1/2}(k)$ by examining the number of
alternations that may occur in a sequence of words, where each word is an extension of its predecessor.
Let us first make this notion precise.



Definition 3.1 [Ste85]. Let $L \subseteq A^*$, $m \ge 0$ and $w, v \in A^*$. We say that $v$ is reachable from $w$ by
an $m$-alternating word extension chain with respect to $L$, in symbols $w \stackrel{m}{\Longrightarrow}_L v$, if and only if there exist
$w_0, w_1, \ldots, w_m \in A^*$ such that

1. $w = w_0 \preceq w_1 \preceq w_2 \preceq \cdots \preceq w_m \preceq v$, and

2. $w_i \in L$ if and only if $w_{i+1} \notin L$ for $0 \le i \le m - 1$.

Next we take a closer look at such chains and define the sets of words that can be reached from 
a word (not) in a given language L by at least m alternations. 

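Definition 3.2. For a language $L \subseteq A^*$ and $m \ge 0$ we define

1. $L^+(m) =_{\mathrm{def}} \{\, v \in A^* \mid \exists w\, (w \in L \wedge w \stackrel{m}{\Longrightarrow}_L v) \,\}$.

2. $L^-(m) =_{\mathrm{def}} \{\, v \in A^* \mid \exists w\, (w \notin L \wedge w \stackrel{m}{\Longrightarrow}_L v) \,\}$.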

We summarize some properties of $L^+(m)$ and $L^-(m)$ in the following proposition.

Proposition 3.3. For a language $L$ and $m \ge 0$ the following statements hold:

1. $L^-(m) = \overline{L}^{\,+}(m)$.

2. $L^+(0) = \langle L \rangle^{\preceq}$ and $L^-(0) = \langle \overline{L} \rangle^{\preceq}$.

3. $L^+(m+1) \cup L^-(m+1) \subseteq L^+(m) \cap L^-(m)$.

4. $v \notin L^+(m) \cup L^-(m)$ for all $m > |v|$.

6. $L^+(m) \ne \emptyset$ implies $L^+(m+1) \subsetneq L^+(m)$, and $L^-(m) \ne \emptyset$ implies $L^-(m+1) \subsetneq L^-(m)$.

7. $L^+(m) = \langle L^+(m) \rangle^{\preceq}$ and $L^-(m) = \langle L^-(m) \rangle^{\preceq}$.

Now we show that any language $L$ can be expressed as a possibly infinite union of set
differences of the sets $L^+(m)$, and likewise of the sets $L^-(m)$.

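Proposition 3.4. For a language $L \subseteq A^*$ the following statements hold:

1. $L = \bigcup_{m \ge 0} \bigl(L^+(2m) \setminus L^+(2m+1)\bigr)$ and
$\overline{L} = (A^* \setminus L^+(0)) \cup \bigcup_{m \ge 1} \bigl(L^+(2m-1) \setminus L^+(2m)\bigr)$.

2. $\overline{L} = \bigcup_{m \ge 0} \bigl(L^-(2m) \setminus L^-(2m+1)\bigr)$ and
$L = (A^* \setminus L^-(0)) \cup \bigcup_{m \ge 1} \bigl(L^-(2m-1) \setminus L^-(2m)\bigr)$.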

Proof. Let $m \ge 0$ and $v \in L^+(2m) \setminus L^+(2m+1)$. Because of $v \in L^+(2m)$ there exists a $w \in L$
with $w \stackrel{2m}{\Longrightarrow}_L v$. Now observe that if $v \notin L$ then $w \stackrel{2m+1}{\Longrightarrow}_L v$, witnessed by the same word extension
chain as before extended by $v$ itself, which is a contradiction to $v \notin L^+(2m+1)$. Hence $v \in L$.

In the same way one proves that $v \in L^+(2m-1) \setminus L^+(2m)$ implies $v \notin L$ for $m \ge 1$, and
that $v \in A^* \setminus L^+(0)$ implies $v \notin L$.

Statement 2 follows from 1 by Proposition 3.3.1. □



Now we want to show that for a regular set $L$ the sets $L^+(m)$ and $L^-(m)$ belong to $\mathcal{L}_{1/2}$.
Proposition 3.3.7 already says that $L^+(m) = \langle L^+(m) \rangle^{\preceq}$ and $L^-(m) = \langle L^-(m) \rangle^{\preceq}$. With
Theorem 2.3 it remains to show that they are regular.



Lemma 3.5. If $L \subseteq A^*$ is regular and $m \ge 0$, then $L^+(m)$ and $L^-(m)$ are regular as well.

Proof. Let $F = (A, S, \delta, s_0, S')$ be a deterministic finite automaton accepting $L$. We construct a
nondeterministic finite automaton $F_m$ that accepts $L^+(m)$ and that realizes the idea of guessing an
$m$-alternating chain of subwords of the input. Define $F_m =_{\mathrm{def}} (A, S_m, \delta_m, s_0^m, S'_m)$ as

• $S_m =_{\mathrm{def}} \underbrace{S \times S \times \cdots \times S}_{(m+1) \text{ times}}$

• $s_0^m =_{\mathrm{def}} (s_0, s_0, \ldots, s_0)$

• $\delta_m((s_1, s_2, \ldots, s_{m+1}), a) =_{\mathrm{def}} \{\, (s_1, s_2, \ldots, s_i, \delta(s_{i+1}, a), \ldots, \delta(s_{m+1}, a)) \mid 0 \le i \le m+1 \,\}$

• $S'_m =_{\mathrm{def}} \{\, (s_1, s_2, \ldots, s_{m+1}) \mid (s_i \in S' \iff i \text{ odd}) \text{ for } i = 1, \ldots, m+1 \,\}$.

We observe that $(s_1, s_2, \ldots, s_{m+1}) \in \delta_m(s_0^m, v)$ if and only if there exist words
$w_1, \ldots, w_{m+1} \in A^*$ such that $w_1 \preceq w_2 \preceq \cdots \preceq w_{m+1} \preceq v$ and $\delta(s_0, w_i) = s_i$ for
$i = 1, \ldots, m+1$.

Now we can conclude:

$v \in L(F_m)$
$\iff$ there exist $s_1, s_2, \ldots, s_{m+1}$ such that $(s_1, s_2, \ldots, s_{m+1}) \in \delta_m(s_0^m, v) \cap S'_m$
$\iff$ there exist $w_1, \ldots, w_{m+1}$ such that $w_1 \preceq w_2 \preceq \cdots \preceq w_{m+1} \preceq v$ and
$(\delta(s_0, w_i) \in S' \iff i \text{ odd})$ for $i = 1, \ldots, m+1$
$\iff v \in L^+(m)$

Because of $L^-(m) = \overline{L}^{\,+}(m)$ we obtain that $L^-(m)$ is also regular. □
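The construction of $F_m$ can be simulated directly. The following Python sketch is a minimal illustration under stated assumptions, not the authors' implementation: the dfa is assumed to be given as a dictionary `delta` mapping `(state, letter)` pairs to states, together with a start state and a set of accepting states, and the function name `in_L_plus` is hypothetical. It tracks the set of $(m+1)$-tuples reachable in $F_m$ while reading $v$ and then checks the acceptance condition "component $i$ is accepting iff $i$ is odd".

```python
def in_L_plus(delta, start, accepting, m, v):
    """Decide v in L^+(m) for L = L(F) by simulating the nfa F_m on v.

    delta     -- dict mapping (state, letter) -> state (transition function of the dfa F)
    start     -- starting state of F
    accepting -- set of accepting states of F
    """
    current = {(start,) * (m + 1)}
    for a in v:
        successors = set()
        for state in current:
            # Components before position i skip the letter, the rest read it,
            # so later components always read a superword of earlier ones.
            for i in range(m + 2):
                successors.add(state[:i] + tuple(delta[(s, a)] for s in state[i:]))
        current = successors
    # 0-based index idx corresponds to component idx + 1 in the text.
    return any(
        all((s in accepting) == (idx % 2 == 0) for idx, s in enumerate(state))
        for state in current
    )
```

The set of tracked tuples can grow to $|F|^{m+1}$ elements, which matches the size of the state set $S_m$ above.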



Corollary 3.6. If $L \subseteq A^*$ is regular and $m \ge 0$, then $L^+(m)$ and $L^-(m)$ are in $\mathcal{L}_{1/2}$.

In order to measure the number of inevitable alternations that occur with respect to a given
language $L$ we look for the maximal $m$ such that the sets $L^+(m)$ and $L^-(m)$ are not empty.

Definition 3.7. For a language $L \subseteq A^*$ we set $m^+(L) =_{\mathrm{def}} \max\{\, m \mid L^+(m) \ne \emptyset \,\}$ and
$m^-(L) =_{\mathrm{def}} \max\{\, m \mid L^-(m) \ne \emptyset \,\}$.



The following proposition is an immediate consequence of Proposition 3.3.

Proposition 3.8. For any language $L \subseteq A^*$ it holds that

1. $m^+(L) = \infty$ if and only if $m^-(L) = \infty$,

2. if $m^+(L) < \infty$ then $|m^+(L) - m^-(L)| = 1$, and

3. $m^-(L) = m^+(\overline{L})$.

4 A Criterion for Membership in $\mathcal{L}_{1/2}(k)$

The measure $m^+$ has already been used by J. Stern to characterize $\mathcal{L}_1 = \mathrm{BC}(\mathcal{L}_{1/2})$, i.e. the
piecewise testable languages over alphabet $A$.

Theorem 4.1 [Ste85]. A language $L \subseteq A^*$ belongs to $\mathcal{L}_1$ if and only if $m^+(L)$ is finite.

Now we will relate the single classes of the boolean hierarchy over $\mathcal{L}_{1/2}$ to particular values
of $m^+$ and $m^-$. This theorem then has the preceding one as a corollary.

Theorem 4.2. Let $L \subseteq A^*$ be a language and $k \ge 1$.

1. $L \in \mathcal{L}_{1/2}(k)$ if and only if $L$ is regular and $m^+(L) < k$.

2. $L \in \mathrm{co}\mathcal{L}_{1/2}(k)$ if and only if $L$ is regular and $m^-(L) < k$.



Proof. We prove Statement 1; Statement 2 then follows immediately by Proposition 3.8.3. We
restrict ourselves to the case of even $k$, the other case being proved analogously.

Let $L$ be regular and $m^+(L) < 2k$. Then $L^+(i) = \emptyset$ for all $i \ge 2k$. By Proposition 3.4.1 we
can write $L$ as

$$L = \bigcup_{i=0}^{k-1} \bigl(L^+(2i) \setminus L^+(2i+1)\bigr),$$

where the sets $L^+(i)$ form a descending chain by Proposition 3.3.3, and Corollary 3.6 shows that
we can use Lemma 2.1.4 to obtain $L \in \mathcal{L}_{1/2}(2k)$.

Now suppose $L \in \mathcal{L}_{1/2}(2k)$. Then $L$ is regular and again by Lemma 2.1.4 there exist
languages $L_1, L_2, \ldots, L_{2k} \in \mathcal{L}_{1/2}$ such that $L_1 \supseteq L_2 \supseteq \cdots \supseteq L_{2k}$ and $L = \bigcup_{i=1}^{k} (L_{2i-1} \setminus L_{2i})$.
Setting $L_0 =_{\mathrm{def}} A^*$ and $L_{2k+1} =_{\mathrm{def}} \emptyset$ we obtain $\overline{L} = \bigcup_{i=0}^{k} (L_{2i} \setminus L_{2i+1})$.

Assume that $L^+(2k) \ne \emptyset$. Then by definition of $L^+(2k)$ there exist $w \in L$, some $v \in A^*$
and $w_0, w_1, \ldots, w_{2k} \in A^*$ such that $w = w_0 \preceq w_1 \preceq w_2 \preceq \cdots \preceq w_{2k} \preceq v$ with $w_{2i} \in L$
and $w_{2i-1} \notin L$. For any $i \in \{0, 1, \ldots, 2k-1\}$ there must be two indices $j, j' \in \{0, \ldots, 2k\}$
with $w_i \in L_j \setminus L_{j+1}$ and $w_{i+1} \in L_{j'} \setminus L_{j'+1}$. Since $w_i \in L \iff w_{i+1} \notin L$ these indices must be
different. Note with Theorem 2.3 that $\langle L_j \rangle^{\preceq} = L_j$ for all $j$. So from $w_i \preceq w_{i+1}$ we can conclude
that $w_{i+1} \in L_j$ as well, which implies $j' > j$. Consequently, the words $w_0, w_1, \ldots, w_{2k}$ are in
$2k + 1$ different sets $L_j \setminus L_{j+1}$ with $j \ge 1$ (since $w_0 \in L \subseteq L_1$). This is a contradiction since there
are only $2k$ such sets. Hence $m^+(L) < 2k$. □

In the remainder of this section we will give two applications of the above criterion for
membership in $\mathcal{L}_{1/2}(k)$. First, we can conclude that the boolean hierarchy over $\mathcal{L}_{1/2}$ is a proper
hierarchy.

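Theorem 4.3. For every $k \ge 1$,

$$\mathcal{L}_{1/2}(k) \subsetneq \mathcal{L}_{1/2}(k+1).$$

Proof. Fix some $a \in A$, and define $|w|_a$ to be the number of occurrences of $a$ in $w \in A^*$. For
$k \ge 1$ define

1. $M_{2k-1} =_{\mathrm{def}} \{\, w \in A^* \mid |w|_a \text{ is odd or } |w|_a \ge 2k-1 \,\}$, and

2. $M_{2k} =_{\mathrm{def}} \{\, w \in A^* \mid |w|_a \text{ is odd and } |w|_a < 2k \,\}$.

Obviously it holds that $m^-(M_k) = k$ and $m^+(M_k) = k - 1$. By Theorem 4.2 we obtain
$M_k \in \mathcal{L}_{1/2}(k) \setminus \mathrm{co}\mathcal{L}_{1/2}(k)$, and by Lemma 2.1.2 we get $\mathcal{L}_{1/2}(k) \subsetneq \mathcal{L}_{1/2}(k+1)$. □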
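The values $m^+(M_k) = k - 1$ and $m^-(M_k) = k$ can be made plausible with a small experiment. Since membership in $M_k$ depends only on $|w|_a$, and $|w|_a$ cannot decrease along an extension chain, it suffices to look at the chain $a^n, a^{n+1}, a^{n+2}, \ldots$. The Python sketch below is illustrative only; the function names are hypothetical and the letter $a$ is fixed as the character `"a"`.

```python
def in_M(k: int, w: str, a: str = "a") -> bool:
    """Membership in M_k as defined in the proof of Theorem 4.3."""
    n = w.count(a)
    if k % 2 == 1:
        # M_{2j-1}: the number of a's is odd or at least 2j-1 (= k)
        return n % 2 == 1 or n >= k
    # M_{2j}: the number of a's is odd and smaller than 2j (= k)
    return n % 2 == 1 and n < k


def alternations_along_powers(k: int, start: int, a: str = "a") -> int:
    """Count membership changes along the chain a^start, a^(start+1), ..., a^(k+2)."""
    flags = [in_M(k, a * n, a) for n in range(start, k + 3)]
    return sum(1 for x, y in zip(flags, flags[1:]) if x != y)
```

Starting at $a^1 \in M_k$ the chain alternates $k - 1$ times, and starting at $\varepsilon \notin M_k$ it alternates $k$ times, in line with the values stated in the proof.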



Next we consider the decidability of the classes $\mathcal{L}_{1/2}(k)$. For a given dfa $F$, the equivalence
$L(F) \in \mathcal{L}_{1/2}(k) \iff m^+(L(F)) < k$ given by Theorem 4.2 can be used to obtain a decision
procedure for the question whether $L(F) \in \mathcal{L}_{1/2}(k)$. This follows from the next lemma. Here and in the
sequel we assume that a regular language is given by a deterministic finite automaton.

Lemma 4.4. Given a dfa $F$ and $k \ge 1$, the questions whether $m^+(L(F)) < k$ and whether $m^-(L(F)) < k$ are
decidable in nondeterministic space $k \cdot \log |F|$.
Proof. Note that $m^+(L(F)) < k \iff L(F)^+(k) = \emptyset \iff L(F_k) = \emptyset$ where $F_k$ is the nfa
constructed in the proof of Lemma 3.5. Obviously, $L(F_k) = \emptyset$ is equivalent with the non-existence
of a path between the starting state of $F_k$ and one of its accepting states. Hence, we have to solve
the graph non-accessibility problem for the transition graph of $F_k$ which is of size $|A| \cdot |F|^{k+1}$.
This can be done in co-nondeterministic space $\log(|F|^{k+1}) = (k+1) \cdot \log |F|$ which is the same
as nondeterministic space $k \cdot \log |F|$ [Imm88, Sze87]. □



Theorem 4.5. For fixed $k \ge 1$, the decision problems for $\mathcal{L}_{1/2}(k)$ and $\mathrm{co}\mathcal{L}_{1/2}(k)$ are in NL.

We are able to decide the question $m^+(L(F)) < k$ for given dfa $F$ and $k \ge 1$. However,
this does not mean automatically that we are able to compute $m^+(L(F))$ effectively. That this is
indeed possible can be concluded from the following dichotomy-lemma by J. Stern.



Lemma 4.6 [Ste85]. For a deterministic finite automaton $F$,

$$m^+(L(F)) < \infty \iff m^+(L(F)) \le 2^{|A| \cdot |F|^2}.$$

This dichotomy enables us to compute the measure $m^+(L(F))$ simply by deciding the
questions $m^+(L(F)) < k$ for $k = 1, 2, \ldots, 2^{|A| \cdot |F|^2} + 1$ with help of Lemma 4.4.
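The procedure just described can be sketched as a deterministic search. The Python code below is a hedged illustration rather than the space-bounded algorithm of Lemma 4.4: it explores the reachable part of $F_m$ explicitly (so it may use exponential memory), it assumes the dfa is given as a dictionary `delta` from `(state, letter)` to states, and the names `l_plus_nonempty`, `m_plus` and the parameter `cutoff` are hypothetical. The cutoff is supplied by the caller, for example from the bound of Lemma 4.6.

```python
def l_plus_nonempty(alphabet, delta, start, accepting, m):
    """Return True iff L^+(m) is nonempty, i.e. F_m has a reachable accepting state."""
    init = (start,) * (m + 1)
    seen, stack = {init}, [init]
    while stack:
        state = stack.pop()
        # Acceptance in F_m: component i (1-based) is accepting iff i is odd.
        if all((s in accepting) == (idx % 2 == 0) for idx, s in enumerate(state)):
            return True
        for a in alphabet:
            for i in range(m + 2):
                nxt = state[:i] + tuple(delta[(s, a)] for s in state[i:])
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return False


def m_plus(alphabet, delta, start, accepting, cutoff):
    """Return m^+(L(F)) if it is at most cutoff - 1, -1 if L(F) is empty,
    and None if L^+(m) is still nonempty for every m <= cutoff."""
    best = -1
    for m in range(cutoff + 1):
        # L^+(m+1) is contained in L^+(m), so the first empty set ends the search.
        if not l_plus_nonempty(alphabet, delta, start, accepting, m):
            return best
        best = m
    return None
```

Returning `None` only signals that the search was inconclusive within the chosen cutoff; whether this means that the measure is infinite depends on the cutoff actually being a valid bound.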

Theorem 4.7. The measures $m^+(L)$ and $m^-(L)$ for a regular language $L$ are computable in

space 2^(1^1). 



Due to the close connection to the FO[<]-logic (Theorem 2.2) we immediately have the
following corollary.

Corollary 4.8. The classes of the boolean hierarchy over the class $\Sigma_1$ of FO[<]-logic are
decidable.



5 A Pattern Characterization for $\mathcal{L}_1$

In this section we give a "forbidden-pattern" characterization of the class $\mathcal{L}_1$ (for other
characterizations of this class see [Sim75, Ste85]). First we define significant patterns that lead to infinite
alternating extension chains. The technically involved part in the proof of the following theorem
is to show conversely that an infinite alternating extension chain implies the occurrence of such a
pattern. To this end we repeatedly select suitable infinite subchains of an infinite chain, we focus
on the position in a word where insertion of a letter leads to alternation and we extensively
exploit the finiteness of an automaton.

We say that the dfa $F = (A, S, \delta, s_0, S')$ has the pattern $P_1$ (cf. Figure 1) if there exist
$v, x, y, z \in A^*$, $a \in A$ and states $s_1, s_2, s_3 \in S$ such that $ya \preceq v$, $\delta(s_0, x) = \delta(s_1, v) = s_1$,
$\delta(s_1, y) = s_2$, $\delta(s_2, a) = s_3$ and $\delta(s_2, z) \in S' \iff \delta(s_3, z) \notin S'$.

We say that the dfa $F$ has the pattern $P_2$ (cf. Figure 2) if there exist $u, x, z, z' \in A^*$, $a \in A$ and
states $s_1, s_2, s_3, s_4 \in S$ such that $az \preceq u$, $\delta(s_0, x) = s_1$, $\delta(s_1, a) = s_2$, $\delta(s_1, z) = \delta(s_3, u) = s_3$,
$\delta(s_2, z) = \delta(s_4, u) = s_4$ and $\delta(s_3, z') \in S' \iff \delta(s_4, z') \notin S'$.
Figure 1: Pattern $P_1$ with $ya \preceq v$ and $s \in S' \iff s' \notin S'$.

Figure 2: Pattern $P_2$ with $az \preceq u$ and $s \in S' \iff s' \notin S'$.

Figure 3: Pattern $P_3$ with $ya \preceq v$ or $az \preceq u$, and $s \in S' \iff s' \notin S'$.

We say that the dfa $F$ has the pattern $P_3$ (cf. Figure 3) if there exist $u, v, x, y, z, z' \in A^*$, $a \in A$
and states $s_1, s_2, s_3, s_4, s_5 \in S$ such that $ya \preceq v$ or $az \preceq u$, $\delta(s_0, x) = \delta(s_1, v) = s_1$, $\delta(s_1, y) = s_2$,
$\delta(s_2, a) = s_3$, $\delta(s_2, z) = \delta(s_4, u) = s_4$, $\delta(s_3, z) = \delta(s_5, u) = s_5$ and
$\delta(s_4, z') \in S' \iff \delta(s_5, z') \notin S'$.

Theorem 5.1. Let $F$ be a dfa and let $\tilde{F}$ be a dfa such that $L(\tilde{F}) = L(F)^R$. Then the following
are equivalent:

(1) $L(F) \in \mathcal{L}_1$,

(2) neither $F$ nor $\tilde{F}$ has the pattern $P_1$,

(3) neither $F$ nor $\tilde{F}$ has the pattern $P_2$,

(4) $F$ does not have the pattern $P_3$.

In the proof we will make use of the following lemma, which is easy to see.

Lemma 5.2. Let $\{\alpha_i\}$ be a sequence of real numbers such that $0 < \alpha_i < 1$ and $\alpha_i \ne \alpha_j$ for
$i \ne j$. Then there exists an infinite monotonic subsequence of $\{\alpha_i\}$.



Proof of Theorem 5.1. (2) $\Rightarrow$ (1): Assume that $L(F) \notin \mathcal{L}_1$ for some dfa $F = (A, S, \delta, s_0, S')$.
We have to show that $F$ has pattern $P_1$ or any $\tilde{F}$ with $L(\tilde{F}) = L(F)^R$ has pattern $P_1$. First
we conclude with Theorem 4.1 that $m^+(L(F))$ is infinite and we can assume w.l.o.g. that there
exists an infinite sequence of words $\{w_i\}$ and a letter $a \in A$ such that $w_i \preceq w_{i+1}$ for all $i \ge 1$,
and $w_{2i-1} = w'_i w''_i$, $w_{2i} = w'_i a w''_i$, $\delta(s_0, w_{2i-1}) \notin S'$ and $\delta(s_0, w_{2i}) \in S'$ for all $i \ge 1$.
Next we introduce markers $m_i$ at the positions where $a$ is inserted when going from $w_{2i-1}$ to
$w_{2i}$, i.e. the word $w'_i a w''_i$ has markers $m_1, m_2, \ldots, m_i$. To show the existence of an infinite
subsequence of words which is monotonic with respect to the insertion positions of the letter $a$,
we inductively attach values $\alpha_i \in \mathbb{R}$ to each marker as follows: Let $\alpha_{i+1} =_{\mathrm{def}} (\beta_{i+1} + \gamma_{i+1})/2$
with $\beta_{i+1} =_{\mathrm{def}} \max(\{\, \alpha_j \mid 1 \le j \le i \text{ and marker } m_j \text{ is left of } m_{i+1} \,\} \cup \{0\})$ and
$\gamma_{i+1} =_{\mathrm{def}} \min(\{\, \alpha_j \mid 1 \le j \le i \text{ and marker } m_j \text{ is right of } m_{i+1} \,\} \cup \{1\})$. We observe that $m_i$ is left of
$m_j$ if and only if $\alpha_i < \alpha_j$. Now Lemma 5.2 tells us that there is an infinite strictly monotonic
subsequence of $\{\alpha_i\}$. We distinguish two cases.

Case 1. Assume that there exists an infinite strictly increasing subsequence of $\{\alpha_i\}$, i.e. there
is a mapping $\tau : \mathbb{N} \to \mathbb{N}$ such that $\tau(i) < \tau(i+1)$ and $\alpha_{\tau(i)} < \alpha_{\tau(i+1)}$ for all $i \ge 1$. For simplicity
we redefine $w_i =_{\mathrm{def}} w_{\tau(i)}$ and summarize the properties of the sequence selected in this way. For
all $i \ge 1$ we have

1. $w_{2i-1} = w'_i w''_i \preceq w'_i a w''_i = w_{2i} \preceq w_{2i+1} = w'_{i+1} w''_{i+1}$,

2. $w'_i a \preceq w'_{i+1}$,

3. $\delta(s_0, w_{2i-1}) \notin S'$ and $\delta(s_0, w_{2i}) \in S'$.

We use the sequence $\{w'_i\}$ as a starting point for subsequent selections of sequences $\{w_{i,k}\}$ for
$k = 0, 1, 2, \ldots$ all having the properties stated in the following claim. Using the finiteness of the
set of states will then enable us to find the pattern $P_1$ in $F$. In the following notations a superscript
in combination with a subscript denotes an index.

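Claim. For every $k \ge 0$ there exists a state $s_k \in S$ and an infinite subsequence $\{w_{i,k}\}$ of $\{w'_i\}$
such that for all $k, i$ there are words $v^1_{i,k}, v^2_{i,k}, \ldots, v^k_{i,k}, u_{i,k} \in A^*$ with

a. $w_{i,k} = v^1_{i,k} a v^2_{i,k} a \cdots a v^k_{i,k} a u_{i,k}$,

b. $v^j_{i,k} \preceq v^j_{i+1,k}$ for $1 \le j \le k$,

c. $u_{i,k-1} \preceq v^k_{i,k}$ for $k \ge 1$,

d. $u_{i,k} a \preceq u_{i+1,k}$, and

e. $\delta(s_0, v^1_{i,k} a v^2_{i,k} a \cdots a v^j_{i,k} a) = s_j$ for $1 \le j \le k$.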
Proof of claim. We proceed by induction on $k$. The case $k = 0$ is easy to see with $u_{i,0} =_{\mathrm{def}} w_{i,0} = w'_i$.
Starting with $\{w_{i,k}\}$ we show how to select a subsequence $\{w_{i,k+1}\}$ fulfilling the assertions
of the claim. First we observe that we can conclude from $u_{i,k} a \preceq u_{i+1,k}$ that $u_{i-1,k} a \preceq u_{i,k}$ for
all $i \ge 2$. Now for every $i \ge 2$ we can identify in $u_{i,k}$ a word left (right, resp.) of this particular
letter $a$, i.e. there are words $v^{k+1}_{i,k}$ and $u'_{i,k}$ such that $u_{i,k} = v^{k+1}_{i,k} a u'_{i,k}$, $u_{i,k} \preceq v^{k+1}_{i+1,k}$ and
$u'_{i,k} a \preceq u'_{i+1,k}$. Hence we can write each $w_{i,k}$ as $w_{i,k} = v^1_{i,k} a v^2_{i,k} a \cdots a v^k_{i,k} a v^{k+1}_{i,k} a u'_{i,k}$. Due to the
finiteness of the set of states of $F$ we can conclude that there exists a state $s_{k+1} \in S$ and a strictly
increasing mapping $\tau : \mathbb{N} \to \mathbb{N}$ such that $\delta(s_0, v^1_{\tau(i),k} a v^2_{\tau(i),k} a \cdots a v^k_{\tau(i),k} a v^{k+1}_{\tau(i),k} a) = s_{k+1}$. Now
we define $w_{i,k+1} =_{\mathrm{def}} w_{\tau(i),k}$, $v^j_{i,k+1} =_{\mathrm{def}} v^j_{\tau(i),k}$ for $1 \le j \le k+1$ and $u_{i,k+1} =_{\mathrm{def}} u'_{\tau(i),k}$. We
leave the verification of the assertions a to e for $\{w_{i,k+1}\}$ as an exercise. (End proof of claim)

We keep the notations of the claim. Now, again due to the finiteness of S there exist k, m with 
l<A;<m< 15*1 + 1 and Sk = Sm- Hence we can define x =def ^im-i'*'"im-i'^ ' ' ' •^'"im-i"' 

^ =dcf v^J^av-^J^a ■ ■ ■ avX'.^^fi, y =dcf v-J^_-^av^J^^^a - ■ ■ av^„^_^aui^rn-i, and z =dcf w^, 
where r is the index such that w\^ra-i = Wr- Note that xy = w^. We conclude with the assertions 
of the claim, that 

JUil ™ 1 

^ ^"aU-i" • • • «<(r)!™-i«^r:m« 



Moreover we see that $\delta(s_0, xyz) = \delta(s_0, w'_r w''_r) = \delta(s_0, w_{2r-1}) \notin S'$ and $\delta(s_0, xyaz) =
\delta(s_0, w'_r a w''_r) = \delta(s_0, w_{2r}) \in S'$. This shows that $F$ has pattern $P_1$.

Case 2. Now assume that there exists an infinite strictly decreasing subsequence of $\{\alpha_i\}$. Then
obviously $\{w_i^R\}$ is an infinite alternating extension chain with respect to $L(F)^R$. Let $\tilde{F}$ be a dfa
accepting $L(F)^R$. Attaching markers in the same way as above leads to the values $1 - \alpha_i$, and hence
there is a strictly increasing subsequence of these values. We can conclude as in case 1 that $\tilde{F}$ has pattern
$P_1$. This finishes the proof of (2) $\Rightarrow$ (1) and we turn to the remaining implications.

(1) $\Rightarrow$ (4): Suppose some dfa $F$ has pattern $P_3$. Then we have for $i \ge 0$ the infinite alternating
word extension chain $x v^i y z u^i z' \preceq x v^i y a z u^i z' \preceq x v^{i+1} y z u^{i+1} z'$ since either $ya \preceq v$ or $az \preceq u$.

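(4) $\Rightarrow$ (3): If some dfa $F$ has pattern $P_2$ then this is also a pattern $P_3$ (with $v = y = \varepsilon$),
which is a contradiction. Next we show that if some dfa $F$ has pattern $P_2$ then any dfa $\tilde{F}$ with
$L(\tilde{F}) = L(F)^R$ has pattern $P_1$, and again this is also pattern $P_3$ (with $u = z = \varepsilon$), a contradiction
as well. So suppose that a dfa $F = (A, S, \delta, s_0, S')$ has the pattern $P_2$ witnessed by $x, z, u, z' \in A^*$
and $a \in A$. Let $\tilde{F} = (A, \tilde{S}, \tilde{\delta}, \tilde{s}_0, \tilde{S}')$ be any dfa with $L(\tilde{F}) = L(F)^R$ and choose $m, k$ with
$m > k \ge 0$ such that $\tilde{\delta}(\tilde{s}_0, (z')^R (u^R)^k) = \tilde{\delta}(\tilde{s}_0, (z')^R (u^R)^{k+m})$. We define $\tilde{x} =_{\mathrm{def}} (z')^R (u^R)^k$,
$\tilde{v} =_{\mathrm{def}} (u^R)^m$, $\tilde{y} =_{\mathrm{def}} z^R$ and $\tilde{z} =_{\mathrm{def}} x^R$. Now one can easily verify that $\tilde{x}, \tilde{v}, \tilde{y}, \tilde{z} \in A^*$ and
$a \in A$ give rise to pattern $P_1$ in $\tilde{F}$ since $\tilde{y} a \preceq \tilde{v}$ follows from $az \preceq u$.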

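(3) $\Rightarrow$ (2): Suppose that a dfa $F = (A, S, \delta, s_0, S')$ has pattern $P_1$ witnessed by $x, v, y, z \in A^*$
and $a \in A$. Let $\tilde{F} = (A, \tilde{S}, \tilde{\delta}, \tilde{s}_0, \tilde{S}')$ be any dfa with $L(\tilde{F}) = L(F)^R$ and choose $m, k \in \mathbb{N}$ with
$m > k \ge 0$ such that $\tilde{\delta}(\tilde{s}_0, (yz)^R (v^R)^k) = \tilde{\delta}(\tilde{s}_0, (yz)^R (v^R)^{k+m})$ and $\tilde{\delta}(\tilde{s}_0, (yaz)^R (v^R)^k) =
\tilde{\delta}(\tilde{s}_0, (yaz)^R (v^R)^{k+m})$. We define $\tilde{x} =_{\mathrm{def}} z^R$, $\tilde{u} =_{\mathrm{def}} (v^R)^m$, $\tilde{z} =_{\mathrm{def}} y^R (v^R)^k$ and $\tilde{z}' =_{\mathrm{def}} x^R$.
Again, one can easily verify that $\tilde{x}, \tilde{u}, \tilde{z}, \tilde{z}' \in A^*$ and $a \in A$ give rise to pattern $P_2$ in $\tilde{F}$ since
$a\tilde{z} \preceq \tilde{u}$ follows from $ya \preceq v$. □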

We remark that the proof of (2) $\Rightarrow$ (1) even shows that the automata $F$ and $\tilde{F}$ do not have the
two instances of pattern $P_1$ with $s \in S'$ and $s' \notin S'$ on one hand, and $s' \in S'$ and $s \notin S'$ on the
other hand. The same holds analogously for the other patterns. To see this note that we can start
the whole investigation at the very beginning of the proof with the sequence $\{w_{i+1}\}$.

Using the above Theorem we obtain a co-NL(=NL)-algorithm for the decision problem for
$\mathcal{L}_1$ simply by testing the occurrence of the pattern $P_3$ in a given dfa. This algorithm is completely
different from those which follow from the characterizations in [Sim75, Ste85]. Note that S. Cho
and D.T. Huynh proved in [CH91] that the decision problem for $\mathcal{L}_1$ is even NL-complete.



6 Complexity Theoretical Consequences 

Let a nondeterministic polynomial time Turing machine $M$ output on every path a symbol from
$A$ and assume a fixed ordering on the set of all paths. We additionally assume here that, given
some input $x$ and the number of a path $i$, one can compute in polynomial time the output of $M$
on path $i$ (balanced computation tree). This leads in a natural way to the notion of the leafstring
of $M$ on some input $x$ when concatenating the output symbols of $M$'s computation tree. Now a
language $L \subseteq A^*$ gives rise to the class $\mathrm{Leaf}^P(L)$ of all languages $L'$ for which there is a machine
$M$ of the above type such that for all $x$ it holds that $x \in L'$ if and only if the leafstring of $M$ on
input $x$ belongs to $L$. Furthermore, for some class $\mathcal{C}$, denote by $\mathrm{Leaf}^P(\mathcal{C})$ the union of all classes
$\mathrm{Leaf}^P(L)$ with $L \in \mathcal{C}$.

As stated in the introduction this leaf language approach led to new insights into the structure
of complexity classes between P and PSPACE. However, most results deal with classes of leaf
languages and an important question is what complexity classes are definable by a single leaf
language. Some progress in this direction has been made in [Bor95, BKS98].

Due to the close connection of the classes of the Straubing-Therien hierarchy to FO[<]-logic
(Theorem 2.2) we can make use of the known relationship between languages definable within
this logic and the classes of the polynomial time hierarchy.

Theorem 6.1 [BV98]. Let $A$ be an arbitrary alphabet with $|A| \ge 2$ and let $k \ge 1$.

1. $\Sigma_k^p = \mathrm{Leaf}^P(\mathcal{L}_{k-1/2})$

2. $\Pi_k^p = \mathrm{Leaf}^P(\mathrm{co}\mathcal{L}_{k-1/2})$



The "forbidden-pattern" characterization of the classes $\mathcal{L}_{1/2}$ from [PW97] enables us to show
which complexity classes are exactly definable by a single leaf language from this class.

Theorem 6.2. For an arbitrary alphabet $A$ with $|A| \ge 2$ we have

$$\{\, \mathrm{Leaf}^P(L) \mid L \in \mathcal{L}_{1/2} \,\} = \{\, \{\emptyset\}, \{\Sigma^* \mid \Sigma \text{ finite alphabet}\}, \mathrm{P}, \mathrm{NP} \,\}$$

and given some dfa accepting a language $L \in \mathcal{L}_{1/2}$ one can effectively determine the class on the
right hand side with which $\mathrm{Leaf}^P(L)$ coincides.
For single leaf languages from the boolean hierarchy over $\mathcal{L}_{1/2}$ the situation is a lot more
complicated. However, we have the following "union-style" theorem which provides an upper
bound for complexity classes definable via such leaf languages. Throughout the paper we studied
the classes $\mathcal{L}_{1/2}(k)$ for an arbitrary but fixed alphabet $A$. Now we make the chosen alphabet
explicit and denote by $\mathcal{L}^A_{1/2}(k)$ the classes $\mathcal{L}_{1/2}(k)$ defined for languages over $A$.

Theorem 6.3. For any $k \ge 1$,

$$\mathrm{NP}(k) = \bigcup_{A \text{ finite alphabet}} \mathrm{Leaf}^P\bigl(\mathcal{L}^A_{1/2}(k)\bigr).$$

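Proof. To see the inclusion from right to left note with Theorem 6.1.1 that $\mathrm{Leaf}^P(\mathcal{L}^A_{1/2}) = \mathrm{NP}$ for
any alphabet $A$. Furthermore it holds for languages $L_1, L_2$ that $\mathrm{Leaf}^P(L_1 \cup L_2) \subseteq \mathrm{Leaf}^P(L_1) \vee
\mathrm{Leaf}^P(L_2)$, $\mathrm{Leaf}^P(L_1 \cap L_2) \subseteq \mathrm{Leaf}^P(L_1) \wedge \mathrm{Leaf}^P(L_2)$ and $\mathrm{Leaf}^P(\overline{L_1}) = \mathrm{coLeaf}^P(L_1)$, where
$\mathcal{C}_1 \vee \mathcal{C}_2 =_{\mathrm{def}} \{\, L'_1 \cup L'_2 \mid L'_1 \in \mathcal{C}_1, L'_2 \in \mathcal{C}_2 \,\}$ and $\mathcal{C}_1 \wedge \mathcal{C}_2 =_{\mathrm{def}} \{\, L'_1 \cap L'_2 \mid L'_1 \in \mathcal{C}_1, L'_2 \in \mathcal{C}_2 \,\}$
for classes $\mathcal{C}_1, \mathcal{C}_2$.

For the other inclusion define for $k \ge 1$ the alphabet $A_k =_{\mathrm{def}} \{0, 1, 2, \ldots, k\}$ and the
language $L_k =_{\mathrm{def}} \{\, w \in A_k^* \mid \max\{ i \in A_k \mid i \preceq w \} \text{ is odd} \,\}$. One can show with Lemma 2.1
that $\mathrm{Leaf}^P(L_k) = \mathrm{NP}(k)$. Observe that $m^+(L_k) = k - 1$, so with Theorem 4.2 it follows that
$L_k \in \mathcal{L}^{A_k}_{1/2}(k)$. □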



Corollary 6.4. If $m^+(L) < k$ for a regular language $L$ then $\mathrm{Leaf}^P(L) \subseteq \mathrm{NP}(k)$.

Note that the measure $m^+$ is computable (Theorem 4.7). Moreover the results obtained here
remain valid if we omit the restriction that the computation tree of a Turing machine must be
balanced.



Finally we compare our results with related work. In [CHVW98] the case of commutative
leaf languages has been studied, i.e. the case where membership to a language depends only on
the numbers of occurrences of the alphabet symbols. For an oracle $D$ we denote by $\mathcal{C}^D$ the
relativized version of a complexity class $\mathcal{C}$. It has been proved in the mentioned paper that for every
commutative language $L$,

$$m^+(L) < k \implies \forall D\, \bigl(\mathrm{Leaf}^P(L)^D \subseteq \mathrm{NP}(k)^D\bigr).$$

Furthermore, other (stronger) measures $n^+$ and $n^-$ have been defined, i.e. $n^+(L) \le m^+(L)$ and
$n^-(L) \le m^-(L)$, and it has been proved that for every commutative language $L$,

$$n^-(L) \ge k \implies \forall D\, \bigl(\mathrm{Leaf}^P(L)^D \supseteq \mathrm{NP}(k)^D\bigr).$$